Taming verification hardness: an efficient algorithm for testing subgraph isomorphism

Authors

  • Haichuan Shang
  • Ying Zhang
  • Xuemin Lin
  • Jeffrey Xu Yu
Abstract

Graphs are widely used to model complex data semantics in many applications. In this paper, we aim to develop efficient techniques for retrieving, from a large set of graphs, those graphs that contain a given query graph. Because testing subgraph isomorphism is NP-hard in general, most existing techniques follow a filtering-and-verification framework to reduce the cost of exact computation; consequently, various feature-based indexes have been developed. While the existing techniques work well for small query graphs, the verification phase becomes a bottleneck as the query graph size increases. Motivated by this, we first propose QuickSI, a novel and efficient algorithm for testing subgraph isomorphism. Second, we develop a new feature-based index technique to accommodate QuickSI in the filtering phase. Our extensive experiments on real and synthetic data demonstrate the efficiency and scalability of the proposed techniques, which significantly outperform the existing techniques.
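The filtering-and-verification framework mentioned in the abstract can be illustrated with a minimal Python sketch. This is not QuickSI and not the paper's index: the filter below is a simple label-count and edge-count check, and the verifier is a plain backtracking search for a (non-induced) subgraph match on undirected, vertex-labeled simple graphs. The names `Graph`, `passes_filter`, `is_subgraph`, and `containing_graphs` are hypothetical and chosen only for illustration.

```python
from collections import Counter


class Graph:
    """Undirected simple graph with one label per vertex (illustrative only)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)                 # vertex -> label
        self.adj = {v: set() for v in self.labels}
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)

    def num_edges(self):
        return sum(len(n) for n in self.adj.values()) // 2


def passes_filter(query, data):
    """Filtering phase: cheap necessary conditions for containment.

    Assumption: a stand-in for a feature-based index. A graph that fails
    here cannot contain the query; one that passes still needs verification.
    """
    need = Counter(query.labels.values())
    have = Counter(data.labels.values())
    if any(have[label] < count for label, count in need.items()):
        return False
    return query.num_edges() <= data.num_edges()


def is_subgraph(query, data):
    """Verification phase: naive backtracking subgraph isomorphism test."""
    # Match high-degree query vertices first to fail early.
    order = sorted(query.labels, key=lambda v: -len(query.adj[v]))
    mapping = {}   # query vertex -> data vertex
    used = set()   # data vertices already matched

    def extend(i):
        if i == len(order):
            return True
        q = order[i]
        for d in data.labels:
            if d in used or data.labels[d] != query.labels[q]:
                continue
            # Every already-matched neighbor of q must be adjacent to d.
            if all(mapping[n] in data.adj[d]
                   for n in query.adj[q] if n in mapping):
                mapping[q] = d
                used.add(d)
                if extend(i + 1):
                    return True
                del mapping[q]
                used.discard(d)
        return False

    return extend(0)


def containing_graphs(query, database):
    """Return the names of database graphs that contain the query."""
    return [name for name, g in database.items()
            if passes_filter(query, g) and is_subgraph(query, g)]
```

The filter is cheap but admits false positives, so every surviving candidate still goes through the exponential-time verifier. That verification step is the bottleneck the abstract refers to for larger query graphs.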

Similar resources

Taming Subgraph Isomorphism for RDF Query Processing

RDF data are used to model knowledge in various areas such as life sciences, Semantic Web, bioinformatics, and social graphs. The size of real RDF data reaches billions of triples. This calls for a framework for efficiently processing RDF data. The core function of processing RDF data is subgraph pattern matching. There have been two completely different directions for supporting efficient subg...

The Hardness of Subgraph Isomorphism

Subgraph Isomorphism is a very basic graph problem, where given two graphs G and H one is to check whether G is a subgraph of H. Despite its simple definition, the Subgraph Isomorphism problem turns out to be very broad, as it generalizes problems such as Clique, r-Coloring, Hamiltonicity, Set Packing and Bandwidth. However, for all of the mentioned problems 2^{O(n)} time algorithms exist, so a n...

Efficient algorithms for supergraph query processing on graph databases

We study the problem of processing supergraph queries on graph databases. A graph database D is a large set of graphs. A supergraph query q on D retrieves all the graphs in D of which q is a supergraph. The large number of graphs in databases and the NP-completeness of subgraph isomorphism testing make it challenging to process supergraph queries efficiently. In this paper, a n...

Isomorphic Implication

We study the isomorphic implication problem for Boolean constraints. We show that this is a natural analog of the subgraph isomorphism problem. We prove that, depending on the set of constraints, this problem is in P, NP-complete, or NP-hard, coNP-hard, and in P^{NP}_{||}. We show how to extend the NP-hardness and coNP-hardness to P^{NP}_{||}-hardness for some cases, and conjecture that this can be done in...

A Minimal Rare Substructures-Based Model for Graph Database Indexing

Systems such as proteins, chemical compounds, and the Internet are stored as graph structures in graph databases. A basic, common problem in graph-related applications is to find the graph data that contain a query. Scanning the whole data set in a graph database is not feasible, since subgraph isomorphism testing is an NP-complete problem. In recent years, some effective graph indexes have been prop...

Journal title:
  • PVLDB

Volume 1  Issue 

Pages  -

Publication date 2008